This report was prepared for ETC5521 Assignment 1 by Team Lorikeet, comprising Aryan Jain, Emily Sheehan, Jimmy Effendy, and DIYAO CHEN.
Measles is a highly infectious disease caused by the measles virus. It can lead to pneumonia, middle ear infections, swelling of the brain, and death.
There is no treatment for measles, but a vaccine exists to prevent its onset. The vaccine involves the injection of attenuated measles antigens that stimulate the production of antibodies and memory cells, providing long-term protection against the virus. When administered properly, the vaccine is 90.5% effective within 72 hours of exposure (Barrabeig et al., 2011).
Unfortunately, a growing number of individuals are refusing vaccination, particularly in the US (Phadke et al., 2016). In Texas, the number of unvaccinated children obtaining exemptions to attend school has increased 28-fold since 2003 (Sinclair et al., 2019). This has led to several outbreaks of vaccine-preventable diseases, such as measles. If this trend continues, the consequences could be calamitous.
This paper aims to determine whether there is a relationship between socioeconomic status and MMR vaccination rate. Specifically, it explores how MMR vaccination rates vary across school types, states, income levels, enrollment numbers, educational attainment levels, and the proportion of the population that is foreign-born. It also compares MMR vaccination rates against overall vaccination rates.
Specifically, this paper hopes to answer the following questions:
Primary Question:
Secondary Questions:
First, the paper discusses the data used and how it was prepared for analysis. Then, the analysis and findings for the research questions are presented and discussed.
To analyse this relationship, a dataset was retrieved from The Wall Street Journal (WSJ). The data comprises vaccination rates for 46,412 schools in 32 U.S. states. The variables include the school academic year; the school’s state, city, county, district, name, type and enrollment; the MMR (measles, mumps and rubella) vaccination rate; the overall vaccination rate; latitude and longitude; and the percentage of students exempted from vaccination for personal, religious or medical reasons. State health departments provided the vaccination data, and the National Center for Education Statistics provided the school locations, which were matched against the school names. Where there was no match, the school’s location was found with the Google Maps API.
Additional 2018 county-level data on income per capita, educational attainment level, and the foreign-born population was also retrieved from the U.S. Census Bureau. This was done using the Census data API provided by the Census Bureau together with the tidycensus package.
One limitation of the WSJ measles data is the inconsistency in its data collection. The data was collected in the 2017-18 school year for 11 states, but in the 2018-19 school year for the remaining 21 states. Moreover, the naniar package shows that this dataset has a considerable number of missing values. Although every precaution has been taken to ensure that the figures are calculated accurately, some of the MMR rates, overall vaccination rates and school types were missing from the original dataset. The following variables are largely unusable due to their high number of missing values:
- xrel: the percentage of students exempted from vaccinations due to religious reasons
- xmed: the percentage of students exempted from vaccinations due to medical reasons
- xper: the percentage of students exempted from vaccinations due to personal reasons
- district: school district

The individual state datasets were scraped from the Tidy Tuesday GitHub repository and combined with the existing measles dataset using left_join to extract the longitude and latitude variables. Functions from the rvest package, including read_html and html_table, were used to scrape the data.
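The missing-value screening behind the list of unusable variables can be illustrated with a minimal sketch. The original analysis was carried out in R with naniar; the plain Python below is only an illustration. The records, the `missing_share` helper and the 50% cut-off are all hypothetical and are not taken from the WSJ data.

```python
def missing_share(rows: list[dict], column: str) -> float:
    """Return the fraction of rows whose value in `column` is missing (None)."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)


# Hypothetical toy records (illustrative values only).
schools = [
    {"name": "A", "mmr": 95.0, "xmed": None, "district": None},
    {"name": "B", "mmr": None, "xmed": 1.2, "district": None},
    {"name": "C", "mmr": 97.5, "xmed": None, "district": None},
    {"name": "D", "mmr": 91.0, "xmed": None, "district": None},
]

# Share of missing values for each variable of interest.
shares = {col: missing_share(schools, col) for col in ("mmr", "xmed", "district")}

# Flag variables where more than half of the values are missing
# (the 0.5 threshold is an assumption made for this sketch).
unusable = [col for col, share in shares.items() if share > 0.5]
print(shares, unusable)
```

In this toy example the mostly empty `xmed` and `district` columns are flagged, while `mmr` is retained despite one missing value.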
Considerable data wrangling was needed for the U.S. Census dataset because it does not describe what each variable represents (e.g. variable B19301_001 represents Income Per Capita). In addition, a county_state variable, combining county and state, was added to both the measles and U.S. Census datasets. This variable is used as the key to merge the two datasets. This was achieved using the tidyverse and janitor packages.
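The renaming and key-based merge described above can be sketched as follows. This is a minimal Python illustration, not the R code used in the report. Only the variable code B19301_001 and the county_state key come from the text; the helper names, the normalisation rule and the sample values are assumptions.

```python
# Map opaque Census variable codes to readable names (only this code is given in the text).
CENSUS_NAMES = {"B19301_001": "income_per_capita"}


def make_key(county: str, state: str) -> str:
    """Build a normalised 'county, state' key (trimmed, lower case)."""
    return f"{county.strip().lower()}, {state.strip().lower()}"


def left_join(left: list[dict], right: list[dict], key: str) -> list[dict]:
    """Keep every row of `left`; add the columns of `right`, or None when no key matches."""
    lookup = {row[key]: row for row in right}
    right_cols = [c for c in (right[0] if right else {}) if c != key]
    joined = []
    for row in left:
        match = lookup.get(row[key])
        extra = {c: (match[c] if match else None) for c in right_cols}
        joined.append({**row, **extra})
    return joined


# Hypothetical sample rows (illustrative values only).
measles = [
    {"county": "King", "state": "Washington", "mmr": 88.0},
    {"county": "Travis ", "state": "Texas", "mmr": 92.0},
]
census_raw = [{"county": "king", "state": "washington", "B19301_001": 30000}]

# Add the shared key to both datasets.
for row in measles:
    row["county_state"] = make_key(row["county"], row["state"])

# Rename the coded Census column and keep only the key plus the renamed variables.
census = [
    {
        "county_state": make_key(r["county"], r["state"]),
        **{CENSUS_NAMES[k]: v for k, v in r.items() if k in CENSUS_NAMES},
    }
    for r in census_raw
]

merged = left_join(measles, census, "county_state")
print(merged)
```

Normalising case and whitespace before building the key avoids silent mismatches between the two sources. Unmatched schools keep their row and receive a missing income value.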
Tuition fees for private schools are generally higher than those for other school types, including public schools (Kerr 2019). This is partly because public schools receive funding from the government while private schools are privately funded.
| Type | Average MMR Vaccination Rate (%) |
|---|---|
| BOCES | 98.75 |
| Public | 96.16 |
| Nonpublic | 94.38 |
| Kindergarten | 94.21 |
| Private | 93.32 |
| Charter | 87.96 |
Table 3.1 shows the 2018/2019 average MMR vaccination rates across different school types in the USA. Boards of Cooperative Educational Services (BOCES) and public schools have the highest MMR vaccination rates of all school types. In contrast, private schools have the second-lowest MMR vaccination rate.
Figure 3.1: Box plot of School’s MMR Vaccination Rates by School Types
Figure 3.1 shows the distribution of MMR vaccination rates across school types as a box plot, with school type on the y-axis and MMR vaccination rate on the x-axis. Most of the distributions are skewed to the left. This means that school types, particularly public and private schools, have a considerable number of outliers whose values are small compared to the rest of the observations.
While the MMR vaccination rates of private schools are not the most varied, they are more dispersed than those of public schools and BOCES. The median vaccination rate for private schools is also well below the median MMR vaccination rate of all schools in the USA. The median, rather than the mean, was chosen to estimate the average rate in this case because the dataset has a significant number of outliers.
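The reason for preferring the median with left-skewed data can be seen in a small sketch. The rates below are hypothetical and chosen only to show how a single low outlier affects the two summaries.

```python
from statistics import mean, median

# Hypothetical school-level MMR rates (%): five typical schools and one low outlier.
rates = [97.0, 96.5, 98.0, 95.5, 97.5, 20.0]

avg = mean(rates)    # pulled down strongly by the single outlier
mid = median(rates)  # stays close to the bulk of the schools

print(f"mean = {avg:.2f}, median = {mid:.2f}")
```

Here the mean falls below 90% even though five of the six schools are above 95%, while the median stays at 96.75%.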
This is consistent with a study by Shaw et al. (2014), which found that private schools have higher rates of exemptions from immunisation requirements than public schools.
| School Type | School MMR Vaccination Rate (%) | School Overall Vaccination Rate (%) | Rate Differences (%) |
|---|---|---|---|
| Kindergarten | 94.20 | 87.99 | 6.21 |
| Private | 93.16 | 91.37 | 1.78 |
| Public | 95.90 | 94.51 | 1.38 |
This section compares schools’ 2018/2019 MMR and overall vaccination rates in the USA. The comparison is summarised in Table 3.2. Unlike the previous section, the table covers only three school types, because only these three have observations for both the MMR and the overall vaccination rate in the WSJ dataset. Table 3.2 shows that kindergartens have a difference of 6.21 percentage points between their MMR and overall vaccination rates. In contrast, private and public schools have similar MMR and overall vaccination rates.
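A comparison of this kind can be sketched as follows. The sketch assumes that schools lacking either rate are dropped and that the difference is taken between the two group averages; the report does not state exactly how the figures in Table 3.2 were computed. The records and the `summarise_by_type` helper are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def summarise_by_type(schools: list[dict]) -> dict:
    """Average MMR and overall rates per school type, using only schools with both rates."""
    grouped = defaultdict(list)
    for school in schools:
        if school["mmr"] is not None and school["overall"] is not None:
            grouped[school["type"]].append(school)
    summary = {}
    for school_type, rows in grouped.items():
        mmr = mean(r["mmr"] for r in rows)
        overall = mean(r["overall"] for r in rows)
        summary[school_type] = {"mmr": mmr, "overall": overall, "difference": mmr - overall}
    return summary


# Hypothetical records: the charter school has no overall rate and is therefore excluded.
schools = [
    {"type": "Public", "mmr": 96.0, "overall": 94.0},
    {"type": "Public", "mmr": 98.0, "overall": 96.0},
    {"type": "Private", "mmr": 92.0, "overall": 91.0},
    {"type": "Charter", "mmr": 90.0, "overall": None},
]

summary = summarise_by_type(schools)
print(summary)
```

Dropping incomplete rows before averaging keeps both averages based on the same set of schools, which is why some school types disappear from the comparison.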
Figure 3.2: Box plot of Differences in School’s MMR and Overall Vaccination Rates by School Types
The distribution of these differences in vaccination rates is shown as a box plot and violin plot in Figure 3.2. As in the previous section, the distributions have a fair number of outliers. Kindergartens have the most dispersed distribution, while private schools have the least. The distribution of the differences for public schools is multimodal.
This section examines vaccination rates across states. In particular, it explores the proportion of schools with an MMR vaccination rate below 95% in each U.S. state. According to the California Department of Public Health, an MMR vaccination rate of at least 95% is needed to prevent community transmission of the disease (Lambert and Willis 2019).
The measles data was grouped by state, and the proportion of schools with an MMR vaccination rate below 95% was calculated. Then, the map_data function was used to create a tibble containing the geographical information of each state. This data was merged with the measles_states data, which contains the proportion for each state. Any missing data or negative values were removed, and the remaining data was plotted onto a map and a bar chart using geom_polygon and geom_col, respectively. The ggplotly function was used to make the maps interactive.
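The grouping step described above can be sketched as follows. This is a minimal Python illustration rather than the R code used for the report. It assumes that missing or negative rates are dropped before the proportion is computed; the state names and values are hypothetical.

```python
from collections import defaultdict

THRESHOLD = 95.0  # herd-immunity threshold (%) cited in the text


def share_below_threshold(schools: list[dict], threshold: float = THRESHOLD) -> dict:
    """Per state, the proportion of schools whose MMR rate is below `threshold`."""
    counts = defaultdict(lambda: [0, 0])  # state -> [schools below threshold, valid schools]
    for school in schools:
        rate = school["mmr"]
        if rate is None or rate < 0:  # drop missing and negative rates
            continue
        counts[school["state"]][1] += 1
        if rate < threshold:
            counts[school["state"]][0] += 1
    return {state: below / total for state, (below, total) in counts.items()}


# Hypothetical records (illustrative values only).
schools = [
    {"state": "State X", "mmr": 96.0},
    {"state": "State X", "mmr": 94.0},
    {"state": "State X", "mmr": -1.0},  # invalid, ignored
    {"state": "State Y", "mmr": 97.0},
    {"state": "State Y", "mmr": 98.0},
    {"state": "State Y", "mmr": 99.0},
    {"state": "State Y", "mmr": 90.0},
]

proportions = share_below_threshold(schools)
print(proportions)
```

The resulting dictionary plays the role of the per-state proportions that are then joined to the map data and plotted.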
As reflected in Figure ??, California and most of the Northeast region of the U.S. have a relatively low proportion of schools with an MMR vaccination rate below 95%. There appears to be no strong association between low MMR vaccination rates and geography: the proportion of schools with low vaccination rates appears to vary randomly across regions.
Figure 3.3: Bar Chart of the Proportion of Schools with Less than 95% Vaccination Rates
Figure 3.3 shows the proportion of schools with low vaccination rates as a bar chart. There are 11 states whose proportion is lower than the average across states. Arkansas, however, has a worryingly high proportion of schools with low vaccination rates, at 99.65%: only 2 of its 567 schools have an MMR vaccination rate higher than 95%.
To analyse the average income of the states with the highest and lowest vaccination rates, an external dataset from the U.S. Census Bureau was retrieved. This data was merged with the measles data grouped by state, and the five states with the highest and the five with the lowest vaccination rates were tabulated.
| States | MMR Vaccination Rate (%) | Per Capita Income | Income Quantiles |
|---|---|---|---|
| Illinois | 97.62 | $28,105.72 | 2 |
| Connecticut | 96.49 | $41,021.25 | 4 |
| Massachusetts | 96.26 | $40,222.43 | 4 |
| South Dakota | 95.25 | $28,617.14 | 2 |
| Vermont | 95.19 | $31,966.00 | 4 |

| States | MMR Vaccination Rate (%) | Per Capita Income | Income Quantiles |
|---|---|---|---|
| Washington | 88.14 | $29,274.29 | 3 |
| Minnesota | 90.89 | $30,827.47 | 3 |
| Arizona | 91.38 | $23,459.40 | 1 |
| Maine | 92.24 | $28,983.25 | 2 |
| Texas | 92.32 | $27,504.22 | 1 |
Table 3.3 shows the five states with the highest MMR vaccination rates, together with their income per capita. These states have varying levels of income per capita, ranging from USD 28,105 to USD 41,021. The table also shows that the states with the highest MMR vaccination rates fall in the medium and high income quantiles (quantiles 2 and 4).
Table 3.4, on the other hand, shows the five states with the lowest MMR vaccination rates. As in the previous table, these states have varying levels of income per capita. None of them, however, falls in the highest income quantile.
Figure 3.4: Scatter plot of School’s MMR Vaccination Rates by Income Per Capita
Figure 3.5: Box plot of School’s MMR Vaccination Rates by Income Quantiles
Figure 3.4 shows that the relationship between income per capita and MMR vaccination rates varies across states. While a linear association is apparent in some states, the relationship between the two variables is difficult to ascertain in most of them.
The distribution of MMR vaccination rates across the income quantiles is plotted in Figure 3.5. The highest MMR vaccination rates occur in schools located in the lowest income quantile, and the figure suggests that vaccination rates are lower for schools in the highest income quantile. However, with the exception of quantile 1, the differences in median rates across the income quantiles are small.
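The report does not state how the income quantiles were assigned. One common approach is rank-based binning into four equally sized groups, sketched below in Python. The `ntile` helper and the income values are hypothetical, and the method used in the original R analysis may differ.

```python
def ntile(values: list[float], n_groups: int = 4) -> list[int]:
    """Assign each value to one of `n_groups` rank-based bins (1 = lowest values)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [0] * len(values)
    for rank, idx in enumerate(order):
        # Integer arithmetic splits the ranks into bins of (near-)equal size.
        groups[idx] = rank * n_groups // len(values) + 1
    return groups


# Hypothetical per capita incomes (USD), illustrative values only.
incomes = [23000, 41000, 28000, 30000, 27000, 32000, 29000, 40000]
quantiles = ntile(incomes)
print(list(zip(incomes, quantiles)))
```

With eight values and four groups, each quantile receives exactly two observations. Rates can then be summarised within each quantile, as in Figure 3.5.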
| State | MMR Vaccination Rate (%) | Education Level |
|---|---|---|
| Illinois | 97.6 | 33.8 |
| Connecticut | 96.5 | 34.0 |
| Massachusetts | 96.3 | 33.9 |
| South Dakota | 95.2 | 34.9 |
| Vermont | 95.2 | 37.1 |
| California | 94.9 | 27.7 |
| Montana | 94.6 | 36.4 |
| Colorado | 94.5 | 34.5 |
| Oregon | 94.3 | 30.9 |
| New York | 94.3 | 33.0 |
| Utah | 93.8 | 28.0 |
| Ohio | 92.8 | 36.2 |
| North Dakota | 92.8 | 33.5 |
| Texas | 92.3 | 29.2 |
| Maine | 92.2 | 38.2 |
| Arizona | 91.4 | 27.5 |
| Minnesota | 90.9 | 34.0 |
| Washington | 88.1 | 29.8 |
Figure 3.6: MMR Vaccination Rate by Education Level
The analysis suggests that there is likely no strong association between socioeconomic status and vaccination rates. Private schools, which are more expensive than public schools and therefore imply greater socioeconomic status, have lower average MMR and overall vaccination rates than their public counterparts. Connecticut and Massachusetts, which have the two highest per capita incomes in Table 3.3, are among the states with the highest MMR vaccination rates, but Illinois and South Dakota achieve similarly high rates with incomes in the second quantile. Washington had the lowest MMR vaccination rate despite a per capita income in the third quantile, higher than that of Illinois. Therefore, it is unlikely that vaccination rate improves consistently with socioeconomic status.
Alboukadel Kassambara (2020). ggpubr: ‘ggplot2’ Based Publication Ready Plots. R package version 0.4.0. https://CRAN.R-project.org/package=ggpubr
Barrabeig, I., Rovira, A., Rius, C., Muñoz, P., Soldevila, N., Batalla, J., & Domínguez, A. (2011). Effectiveness of measles vaccination for control of exposed children. The Pediatric Infectious Disease Journal, 30(1), 78–80.
C. Sievert. Interactive Web-Based Data Visualization with R, plotly, and shiny. Chapman and Hall/CRC Florida, 2020.
Cockcroft, A., Usman, M. U., Nyamucherera, O. F., Emori, H., Duke, B., Umar, N. A., & Andersson, N. (2014). Why children are not vaccinated against measles: a cross-sectional study in two Nigerian States. Archives of Public Health = Archives Belges de Sante Publique, 72(1), 48.
Commonwealth of Australia. (2020, May 27). Measles. Retrieved 25 August 2020, from https://www.health.gov.au/health-topics/measles#what-is-measles
Hadley Wickham and Dana Seidel (2020). scales: Scale Functions for Visualization. R package version 1.1.1. https://CRAN.R-project.org/package=scales
H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.
Hao Zhu (2019). kableExtra: Construct Complex Table with ‘kable’ and Pipe Syntax. R package version 1.1.0. https://CRAN.R-project.org/package=kableExtra
Kamil Slowikowski (2020). ggrepel: Automatically Position Non-Overlapping Text Labels with ‘ggplot2’. R package version 0.8.2. https://CRAN.R-project.org/package=ggrepel
Mock, T. (2018). TidyTuesday: A weekly social data project in R. https://github.com/rfordatascience/tidytuesday
Nicholas Tierney, Di Cook, Miles McBain and Colin Fay (2020). naniar: Data Structures, Summaries, and Visualisations for Missing Data. R package version 0.5.2. https://CRAN.R-project.org/package=naniar
Original S code by Richard A. Becker, Allan R. Wilks. R version by Ray Brownrigg. Enhancements by Thomas P Minka and Alex Deckmyn. (2018). maps: Draw Geographical Maps. R package version 3.3.0. https://CRAN.R-project.org/package=maps
Pebesma, E., 2018. Simple Features for R: Standardized Support for Spatial Vector Data. The R Journal 10 (1), 439-446, https://doi.org/10.32614/RJ-2018-009
Phadke, V. K., Bednarczyk, R. A., Salmon, D. A., & Omer, S. B. (2016). Association Between Vaccine Refusal and Vaccine-Preventable Diseases in the United States: A Review of Measles and Pertussis. JAMA: The Journal of the American Medical Association, 315(11), 1149–1158.
Queensland Health. (2019, October 22). What is measles and why do we vaccinate against it? Retrieved 25 August 2020, from https://www.health.qld.gov.au/news-events/news/what-is-measles-why-vaccinate
R Core Team (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
Sinclair, D. R., Grefenstette, J. J., Krauland, M. G., Galloway, D. D., Frankeny, R. J., Travis, C., … Roberts, M. S. (2019). Forecasted Size of Measles Outbreaks Associated With Vaccination Exemptions for Schoolchildren. JAMA Network Open, 2(8), e199768.
Shaw, J., Tserenpuntsag, B., McNutt, L.-A., & Halsey, N. (2014). United States private schools have higher rates of exemptions to school immunization requirements than public schools. The Journal of Pediatrics, 165(1), 129–133.
Tim Appelhans, Florian Detsch, Christoph Reudenbach and Stefan Woellauer (2020). mapview: Interactive Viewing of Spatial Data in R. R package version 2.9.0. https://CRAN.R-project.org/package=mapview
VanAntwerp, T. (2016). TaxFoundation facts-and-figures. GitHub repository. Retrieved from https://github.com/TaxFoundation/facts-and-figures
Wickham et al., (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686, https://doi.org/10.21105/joss.01686
Yihui Xie (2020). knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.29.
Yihui Xie (2015) Dynamic Documents with R and knitr. 2nd edition. Chapman and Hall/CRC. ISBN 978-1498716963
Yihui Xie (2014) knitr: A Comprehensive Tool for Reproducible Research in R. In Victoria Stodden, Friedrich Leisch and Roger D. Peng, editors, Implementing Reproducible Computational Research. Chapman and Hall/CRC. ISBN 978-1466561595
Kerr, Emma. 2019. “The Cost of Private Vs. Public Colleges.” U.S. News, June.
Lambert, Diana, and Daniel J. Willis. 2019. “California Charter, Private Schools Report Lower Vaccination Rates Than Traditional Public Schools.” EdSource, August.